Evaluating Ranking Diversity and Summarization in Microblogs using Hashtags
Authors
Abstract
Recently developed diversification techniques for web search assume that, for each query, there is a set of underlying aspects or subtopics that address specific user intents. These techniques attempt to balance the relevance of the retrieved documents with the coverage of the aspects. Evaluating diversification techniques requires some way of defining a set of aspects for each test query and a “gold standard” assignment of documents to aspects. This requirement has made diversification difficult to study for new data such as microblogs. A related task, keyword-based summarization, is important for microblogs but is also hard to evaluate. In this paper, we describe an approach to evaluating ranking diversity and summarization in microblogs by assuming that hashtags correspond to subtopics. We show the viability of this approach to evaluation, and validate the assumption that hashtags are subtopics. The results show that, despite the differences in content, the best techniques for search diversification with microblogs are the same as with web pages. The summarization results confirm that the DSPapprox technique is effective and that phrase-based summarization techniques perform somewhat worse than single words in terms of covering the underlying aspects.
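The evaluation idea in the abstract, treating each hashtag as a subtopic, can be illustrated with a small sketch. The abstract does not say which diversity metrics the authors use, so the example below assumes subtopic recall at rank k, one common diversity measure. The function names, the hashtag regular expression, and the sample posts are hypothetical and serve only as illustration.

```python
import re

# Assumption: a hashtag is '#' followed by word characters.
HASHTAG = re.compile(r"#(\w+)")


def hashtags(text):
    """Return the set of lowercased hashtags found in one microblog post."""
    return {tag.lower() for tag in HASHTAG.findall(text)}


def subtopic_recall(ranking, k):
    """Fraction of all hashtags in the ranking covered by its top-k posts.

    Each distinct hashtag is treated as one subtopic. The pool of
    subtopics is taken from the ranked posts themselves, which is a
    simplification made for this sketch.
    """
    all_tags = set().union(*(hashtags(post) for post in ranking))
    if not all_tags:
        return 0.0
    covered = set().union(*(hashtags(post) for post in ranking[:k]))
    return len(covered) / len(all_tags)


# Hypothetical ranked list of posts for a single query.
posts = [
    "Results are in #election #vote",
    "Polls close soon #election",
    "Stadium is packed #worldcup",
    "Turnout is high #vote",
]

for k in (1, 2, 3):
    print(k, round(subtopic_recall(posts, k), 3))
```

In this toy ranking the second post adds no new hashtag, so recall stays flat between k = 1 and k = 2 and reaches full coverage only at k = 3. A diversity-aware ranker would aim to promote the third post.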
Similar resources
Summarization: (1) Using MMR for Diversity-Based Reranking and (2) Evaluating Summaries
This paper develops a method for combining query relevance with information novelty in the context of text retrieval and summarization. The Maximal Marginal Relevance (MMR) criterion strives to reduce redundancy while maintaining query relevance in reranking retrieved documents and in selecting appropriate passages for text summarization. Preliminary results indicate some benefits for MMR dive...
Automatic Hashtag Recommendation for Microblogs using Topic-Specific Translation Model
Microblogging services continue to grow in popularity, and users publish a massive number of instant messages through them every day. Many tweets are marked with hashtags, which usually represent groups or topics of tweets. Hashtags may provide valuable information for many applications, such as retrieval, opinion mining, and classification. However, since hashtags should be manually annotated, only ...
Personalized Hashtag Suggestion for Microblogs
In microblogging services, users can generate hashtags to categorize their tweets. However, a majority of microblogs do not contain hashtags, which has prompted active research on the problem of automatic hashtag recommendation for microblogs. Previous work on this problem mostly does not take the user’s preference into consideration. In this paper, we propose a novel personalized ha...
Be In The Know: Connecting News Articles to Relevant Twitter Conversations
In the era of data-driven journalism, data analytics can deliver tools to support journalists in connecting to new and developing news stories, e.g., as echoed in microblogs such as Twitter, the new citizen-driven media. In this paper, we propose a framework for tracking and automatically connecting news articles to Twitter conversations as captured by Twitter hashtags. For example, such a syst...
Collective Opinion Target Extraction in Chinese Microblogs
Microblog messages pose severe challenges for current sentiment analysis techniques due to some inherent characteristics such as the length limit and informal writing style. In this paper, we study the problem of extracting opinion targets of Chinese microblog messages. Such a fine-grained word-level task has not yet been well investigated in microblogs. We propose an unsupervised label propagati...
Publication date: 2015